DETAILED ACTION

Claim Rejections - 35 USC § 103 

1. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

2. Claims 59, 62, 64, 66, 69, 71, 73, 76, and 78 are rejected under 35 U.S.C. 103(a)
as being unpatentable over Hughes et al. in view of Jacobs et al. 

Concerning independent claim 59, Hughes et al. discloses a system for
accessing a remote voice recognition resource on a server ("a communication module") 
from a telephone ("a local device"), comprising: 

"a communication module operable to receive input from the local device and to 
transmit data to the local device to enable the local device to provide the data in an 
output response to a user of the local device" - server system 300 is attached to the 
LAN 250 via network interface card 310 (column 5, lines 37 to 50: Figure 1); server 
system 300 is equivalent to "a communication module"; a telephone from which a caller 
is calling via the telephone network is "the local device"; a recognition resource remains 
in a Wait_Event state, and processes an incoming telephone signal when a recognized 
word or phrase is spoken ("operable to receive input from the local device") (column 8, 
lines 14 to 33: Figure 3); a prompt is played out to the caller ("to transmit data to the 
local device to enable the local device to provide the data in an output response to a
user of the local device"), via a state table action (column 8, lines 59 to column 9, line 2; 
column 9, lines 29 to 45: Figure 4); a prompt is data transmitted to the local device; 

"wherein the communication module is further operable to detect an additional 
user input from the local device and in response, to cause the local device to cease 
providing the output response to the user" - for barge-in, an application can specify that 
prompt output should be terminated in response to voice input (column 9, lines 29 to 45: 
Figure 4); barge-in, or cut-through, is a facility that is particularly useful for a voice 
processing application such as voice mail, where the caller is likely to encounter the 
same sequence of prompts repeatedly, and accordingly may be able to select a desired 
option without needing to listen to all of the prompt (column 8, line 59 to column 9, line 
2); a voice input from a caller to barge-in during a prompt is "an additional user input" 
causing the prompt to cease; 

"a processing module coupled to the communication module and operable to 
perform speech recognition on the received input" - speech recognition software 320 
("a processing module") resides on, and is supported by, server system 300 ("the 
communication module") (column 5, lines 37 to 50: Figure 1). 

Concerning independent claim 59, Hughes et al. is concerned with processing
telephony data between a server performing speech recognition and a client calling 
from a telephone. Implicitly, a voice processing system performs activities "for directing 
an action" on a caller's telephone, including playing a prompt, displaying text, and 
directing a call. Hughes et al. does not expressly disclose a limitation of "wherein the 
communication module is further operable to transmit a control signal to the local device
for directing an action in a primary functionality component of the local device". 
However, Jacobs et al. teaches a distributed voice recognition system, where a central 
communications center 42 ("the communication module") receives speech features from 
a portable phone 40 ("the local device"), and central communications center provides 
speech features to a word decoder 48, which determines a linguistic estimate of the 
speech by speech recognition. Then, a command signal ("a control signal") is 
transmitted to portable phone 40, which decodes the signal, and provides the command 
signal to control element 38, which in response to the command signal, provides an 
intended response ("for directing an action") of e.g., dialing a phone number or providing 
information to display on the portable phone ("a primary functionality component of the 
local device"). (Column 5, Lines 44 to 67: Figure 2) Here, a "primary functionality" of 
portable phone 40 is to make telephone calls or display information, although portable 
phone serves as a communication device for accessing speech recognition services 
from communication center 42, too. Jacobs et al. states that advantages include
reducing cost in the cellular telephone because word decoder hardware no longer
resides at telephone 40 and an improvement in recognition accuracy. (Column 5, Lines
12 to 21) It would have been obvious to one having ordinary skill in the art to transmit a
control signal from a communications center to direct an action in a primary functionality 
component of a local device as taught by Jacobs et al. in a voice processing system of 
Hughes et al. for a purpose of reducing cost of a cellular telephone by placing word 
decoder hardware at a communication center.

Concerning independent claims 66 and 73, Hughes et al. discloses a method and
computer program product (column 4, line 36) for accessing a remote voice recognition 
resource on a server from a telephone, comprising: 

"receiving an audio input from a local device, the audio input based on speech 
input issued by a user" - a recognition resource receives an incoming telephone signal 
of a word or phrase (column 8, lines 13 to 33: Figure 3); a word or a phrase spoken by a 
caller is "an audio input" and "speech input issued by a user"; a telephone from which a 
caller is calling via the telephone network is "the local device"; 

"performing speech recognition on the received audio input" - speech recognition 
software on server system 300 processes the incoming telephone signal, until it has 
recognized the word or phrase spoken, and returns recognized text; an application 
remains in a Wait_Event state until a word or phrase is received (column 8, lines 13 to
33: Figure 3); 

"transmitting data to the local device to enable the local device to provide the 
data in an output response to the user" - a prompt is played out to the caller (column 8, 
line 59 to column 9, line 2; column 9, lines 29 to 45: Figure 4); 

"detecting an additional audio user input from the local device" - a caller is 
allowed to make a spoken interruption of the prompt in a barge-in or cut-through facility 
(column 8, line 59 to column 9, line 2); 

"transmitting a signal to the local device to cause the local device to cease 
providing the output response to the user" - a state table action allows an application 
designer to specify that prompt output should be stopped in particular eventualities; for
barge-in, an application can specify that prompt output should be terminated in 
response to voice input; one such eventuality is where the caller inputs a DTMF tone, 
which is recognized by appropriate software (column 9, lines 29 to 45: Figure 4); 
terminating a prompt output is equivalent to a signal that is transmitted "to the local 
device to cause the local device to cease provision of the output response to the user". 

Concerning independent claims 66 and 73, Hughes et al. is concerned with 
processing telephony data between a server performing speech recognition and a client 
calling from a telephone. Implicitly, a voice processing system performs activities "for 
directing an action" on a caller's telephone, including playing a prompt, displaying text, 
and directing a call. Hughes et al. does not expressly disclose a limitation of 
"transmitting a control signal to the local device for directing an action in a primary 
functionality component of the local device". However, Jacobs et al. teaches a 
distributed voice recognition system, where a central communications center 42 ("the 
communication module") receives speech features from a portable phone 40 ("the local 
device"), and central communications center provides speech features to a word 
decoder 48, which determines a linguistic estimate of the speech by speech recognition. 
Then, a command signal ("a control signal") is transmitted to portable phone 40, which 
decodes the signal, and provides the command signal to control element 38, which in 
response to the command signal, provides an intended response ("for directing an 
action") of e.g., dialing a phone number or providing information to display on the 
portable phone ("a primary functionality component of the local device"). (Column 5, 
Lines 44 to 67: Figure 2) Here, a "primary functionality" of portable phone 40 is to make
telephone calls or display information, although portable phone serves as a 
communication device for accessing speech recognition services from communication 
center 42, too. Jacobs et al. states that advantages include reducing cost in the cellular 
telephone because word decoder hardware no longer resides at telephone 40 and an 
improvement in recognition accuracy. (Column 5, Lines 12 to 21) It would have been 
obvious to one having ordinary skill in the art to transmit a control signal from a 
communications center to direct an action in a primary functionality component of a 
local device as taught by Jacobs et al. in a voice processing system of Hughes et al. for 
a purpose of reducing cost of a cellular telephone by placing word decoder hardware at 
a communication center. 

Concerning claims 62, 69, and 76, Hughes et al. discloses playing out a prompt
to a caller (column 8, line 59 to column 9, line 2; column 9, lines 29 to 45: Figure 4); a 
prompt is "audio data" that is transmitted to the remote device, i.e. a caller calling from a 
telephone. 

Concerning claims 64, 71, and 78, Hughes et al. discloses that a caller is calling
from a telephone ("the local device") (column 1, lines 8 to 25); implicitly, a caller's
telephone is not capable of processing a caller's voice input by speech recognition.

3. Claims 61, 63, 65, 68, 70, 72, 75, 77, and 79 are rejected under 35 U.S.C. 103(a)
as being unpatentable over Hughes et al. in view of Jacobs et al. as applied to claims 
59, 66, and 73 above, and further in view of Houser et al. 

Concerning claims 61, 63, 68, 70, 75, and 77, Hughes et al. discloses a
commonly known voice processing system for speech recognition at a remote server 
from voice input at a caller's telephone that provides audio prompts and has a barge-in 
facility, but omits transmitting data including video data and a text message. However, 
it is known to obtain various forms of information by an interface through speech 
recognition, as in management of voice mail by speech recognition from a telephone. 
Specifically, Houser et al. teaches an information system having a speech interface,
where a terminal unit 16 includes a processor for executing a speech recognition 
algorithm to recognize spoken commands for accessing information transmitted by 
information distribution system 12. Information distribution system 12 supplies or 
broadcasts information to a terminal unit 16, where "information" includes, but is not 
limited to, analog video, analog audio, digital video, digital audio, text services, such as 
news articles, sports scores, stock market quotations, and weather reports, electronic 
messages ("a text message"), electronic program guides, database information, and 
software including game programs. (Column 5, Line 39 to Column 6, Line 14: Figure 1) 
An objective is to provide a subscriber with access to information by a speech 
recognition interface, which enhances the interface by allowing control using language 
naturally spoken by the subscriber for implementation of tasks not easily implemented 
using menu screens and key presses. (Column 2, Lines 19 to 29) It would have been 
obvious to one having ordinary skill in the art to provide data to a subscriber in the form
of information including video and a text message as taught by Houser et al. in a voice
processing system including a barge-in facility of Hughes et al. for a purpose of 
providing a subscriber with access to information via a speech recognition interface for 
implementing tasks not easily performed by menu screens and key presses. 

Concerning claims 65, 72, and 79, Houser et al. teaches that information is
retrieved from an information distribution center 12 in response to commands from 
terminal unit 16 for accessing information transmitted by information distribution center 
12 (column 5, line 39 to column 6, line 14: Figure 1); additionally, electronic 
programming guide (EPG) data is accessed from an information provider 114-3, 
including television schedule information arranged by time and channel, and transmitted 
to subscriber units (column 22, lines 19 to 51: Figure 2C).

Response to Arguments 

4. Applicants' arguments filed 04 June 2008 have been considered but are moot in 
view of the new grounds of rejection, necessitated by amendment. 

Conclusion 

5. Applicants' amendment necessitated the new grounds of rejection presented in 
this Office Action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP
§ 706.07(a). Applicants are reminded of the extension of time policy as set forth in 37
CFR 1.136(a).

A shortened statutory period for reply to this final action is set to expire THREE
MONTHS from the mailing date of this action. In the event a first reply is filed within 
TWO MONTHS of the mailing date of this final action and the advisory action is not 
mailed until after the end of the THREE-MONTH shortened statutory period, then the 
shortened statutory period will expire on the date the advisory action is mailed, and any 
extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of
the advisory action. In no event, however, will the statutory period for reply expire later 
than SIX MONTHS from the date of this final action. 

Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Martin Lerner, whose telephone number is (571) 272-7608.
The examiner can normally be reached on 8:30 AM to 6:00 PM Monday to
Thursday. 

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, David R. Hudspeth, can be reached on (571) 272-7843. The fax phone
number for the organization where this application or proceeding is assigned is
571-273-8300.

Information regarding the status of an application may be obtained from the 
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a
USPTO Customer Service Representative or access to the automated information 
system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. 



/Martin Lerner/ 
Primary Examiner 
Art Unit 2626 
August 1, 2008 



